pyannote.metrics: A Toolkit for Reproducible Evaluation, Diagnostic, and Error Analysis of Speaker Diarization Systems
نویسنده
چکیده
pyannote.metrics is an open-source Python library aimed at researchers working in the wide area of speaker diarization. It provides a command line interface (CLI) to improve reproducibility and comparison of speaker diarization research results. Through its application programming interface (API), a large set of evaluation metrics is available for diagnostic purposes of all modules of typical speaker diarization pipelines (speech activity detection, speaker change detection, clustering, and identification). Finally, thanks to visualization capabilities, we show that it can also be used for detailed error analysis purposes. pyannote.metrics can be downloaded from http://pyannote.github.io.
منابع مشابه
The AMI Speaker Diarization System for NIST RT06s Meeting Data
We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Erorr Tradeoff analysis commonly used in speaker detection tasks. The speaker diarization systems are based on the TNO and ICSI system submitted for RT05s...
متن کاملDiarTk : An Open Source Toolkit for Research in Multistream Speaker Diarization and its Application to Meetings Recordings
The speaker diarization task consists of inferring “who spoke when” in an audio stream without any prior knowledge and has been object of several NIST international evaluation campaigns is last years. A common trend for improving performances has been the use of several different feature streams as diverse as speaker location features, visual features or noise robust acoustic features. This pap...
متن کاملImproving speaker segmentation via speaker identification and text segmentation
Speaker segmentation is an essential part of a speaker diarization system. Common segmentation systems usually miss speaker change points when speakers switch fast. These errors seriously confuse the following speaker clustering step and result in high overall speaker diarization error rates. In this paper two methods are proposed to deal with this problem: The first approach uses speaker ident...
متن کاملImproving Speaker Diarization
This paper describes the LIMSI speaker diarization system used in the RT-04F evaluation. The RT-04F system builds upon the LIMSI baseline data partitioner, which is used in the broadcast news transcription system. This partitioner provides a high cluster purity but has a tendency to split the data from a speaker into several clusters when there is a large quantity of data for the speaker. In th...
متن کاملThe IBM RT07 Evaluation Systems for Speaker Diarization on Lecture Meetings
We present the IBM systems for the Rich Transcription 2007 (RT07) speaker diarization evaluation task on lecture meeting data. We first overview our baseline system that was developed last year, as part of our speech-to-text system for the RT06s evaluation. We then present a number of simple schemes considered this year in our effort to improve speaker diarization performance, namely: (i) A bet...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017